AMENDMENTS 

Amendments to the Claims: 

Please replace the claims with the following listing of claims. 

1. (Currently amended) A document automatic classification system, comprising: 

list generation means for generating a word list for each of at least two categories by 
extracting words from a learning document set; 

unnecessary word determination means for relatively determining an unnecessary word 
for a category on the basis of the number of occurrences of a given word within at least one other 
category by using the list generated by said list generation means wherein said unnecessary word 
determination means determines a word is an unnecessary word in response to the word having a 
lesser number of occurrences than a given standard in the at least one other category, the given 
standard comprised of a predetermined threshold scaled by the number of documents in the at 
least one other category; and

means for generating a document classification catalog by eliminating words determined 
to be unnecessary words from each of the word lists. 

2. (Previously presented) The system according to Claim 1, wherein said list generation 
means generates a list indicating a frequency of appearance of a given word for each category. 

3. (Currently amended) The system according to Claim 1, wherein the document
classification catalog is comprised of a plurality of vector spaces wherein each vector space
represents at least one category [[unnecessary word determination means extracts a word
belonging to a given category and determines it to be an unnecessary word in response to the
word having a greater number of occurrences in another category than is allowed by a given]].

4. (Currently amended) The system according to Claim 3, wherein a target classification
document is defined by a document vector and wherein a distance is defined between the
document vector and each of the plurality of vector spaces such that the distance indicates a
degree of similarity between the target classification document and a category represented by the
vector spaces [[the given standard is determined according to a predetermined threshold]].

5. (Previously presented) The system according to Claim 1, further comprising:

classification catalog storage means for storing a list for each category from which
unnecessary words were eliminated based on the determination with said unnecessary word
determination means; and

document classification means for performing classification processing for classification 
target documents by using said document classification catalog. 

6. (Currently amended) A document automatic classification system, comprising: 

a classified document set storage device for storing documents classified according to at 
least two categories; 

a category table generation unit for generating a table, the table comprising: 

word lists corresponding to each of the at least two categories wherein the 
word lists are generated by extracting words from a learning document set; and 

frequencies comprising the number of occurrences of each extracted word
within the learning document set;

an unnecessary word elimination unit for eliminating an unnecessary word from a
category in the table on the basis of the number of occurrences within at least one other category
of a given word, wherein said unnecessary word elimination unit extracts a word belonging to a
given category and eliminates the word as an unnecessary word from said table in response to the
word having a lesser number of occurrences than a given standard in the at least one other
category, the given standard comprised of a predetermined threshold scaled by the number of
documents in the at least one other category [[the word appearing more frequently in another
category than is allowed by a given standard]]; and

a classification catalog storage device for storing the table from which the unnecessary 
word was eliminated by said unnecessary word elimination unit.

7. (Original) The system according to Claim 6, further comprising:

a classification target document storage device for storing classification target documents 
to be classified; and 

a document classification processing unit for performing classification processing for the 
classification target documents stored in said classification target document storage device by 
using said table stored in said classification catalog storage device. 

8. (Cancelled) 

9. (Previously presented) The system according to Claim 6, wherein said table contains 
information on each word, a frequency of appearance of each word, and a part of speech of each 
word. 

10. (Previously presented) An unnecessary word determination method in a document 
automatic classification system, comprising the steps of: 

generating a word list for each of at least two categories by extracting words from a 
learning document set, the word list containing information on a frequency of appearance of each 
extracted word within each category; 

determining an unnecessary word for a category on the basis of the relative number of 
occurrences of a given word within at least one other category wherein a word is determined to 
be unnecessary in response to the word having a lesser number of occurrences than a given 
standard in the at least one other category, the given standard comprised of a predetermined 
threshold scaled by the number of documents in the at least one other category; and

eliminating words determined to be unnecessary words from each of the word lists. 

11. (Currently amended) The method according to Claim 10, further comprising
generating a document classification catalog by eliminating the words determined to be
unnecessary words from the word lists [[wherein, in said step of determining the unnecessary word,
the unnecessary word is determined according to whether one word selected from the given
category appears in said other categories more frequently than is allowed by a given standard]].

12. (Currently amended) The method according to Claim 11, wherein the document
classification catalog is comprised of a plurality of vector spaces wherein each vector space
represents at least one category [[said given standard is a value obtained from a predetermined
given threshold]].

13. (Currently amended) The method according to Claim 12[[1]], wherein a target
classification document is defined by a document vector and wherein a distance is defined
between the document vector and each of the plurality of vector spaces such that the distance
indicates a degree of similarity between the target classification document and a category
represented by the vector spaces [[said given standard is determined according to said frequency of
the word in said other categories and a total frequency of all words in said other categories]].

14. (Currently amended) An unnecessary word determination method in a document 
automatic classification system, comprising the steps of: 

acquiring information on words from a document set, classifying the words according to 
category, and storing the words in a storage device; 

recognizing the number of occurrences within at least one other category of a word 
belonging to a given category on the basis of the acquired information; 

determining whether the word is unnecessary for identifying the given category on the
basis of the recognized number of occurrences [[frequency]] wherein the word is determined to be
unnecessary in response to the word having a lesser number of occurrences than a given standard
in the at least one other category, the given standard comprised of a predetermined threshold
scaled by the number of documents in the at least one other category; and

generating a document classification catalog by eliminating words determined to be 
unnecessary words. 

15. (Previously presented) The method according to Claim 14, further comprising 
storing said classification catalog into the storage device.

16. (Previously presented) The method according to Claim 15, further comprising the
step of performing classification processing for classification target documents by using the 
classification catalog stored in said storage device. 